Molecular Biology - Introduction


Dolly the sheep was the first cloned mammal.


The three letters “DNA” have now become associated with crime solving, 
paternity testing, human identification, and genetic testing. DNA can be 
retrieved from hair, blood, or saliva. With the exception of identical twins, 
each person’s DNA is unique and it is possible to detect differences 
between human beings on the basis of their unique DNA sequence. 


DNA analysis has many practical applications beyond forensics and 
paternity testing. DNA testing is used for tracing genealogy and identifying 
pathogens. In the medical field, DNA is used in diagnostics, new vaccine
development, and cancer therapy. It is now possible to determine
predisposition to many diseases by analyzing genes. 


DNA is the genetic material passed from parent to offspring for all life on 
Earth. The technology of molecular genetics developed in the last half 
century has enabled us to see deep into the history of life to deduce the 
relationships between living things in ways never thought possible. It also 
allows us to understand the workings of evolution in populations of 
organisms. Over a thousand species have had their entire genome 
sequenced, and there have been thousands of individual human genome 
sequences completed. These sequences will allow us to understand human 
disease and the relationship of humans to the rest of the tree of life. Finally, 
molecular genetics techniques have revolutionized plant and animal 
breeding for human agricultural needs. All of these advances in 
biotechnology depended on basic research leading to the discovery of the 
structure of DNA in 1953, and the research since then that has uncovered 
the details of DNA replication and the complex process leading to the 
expression of DNA in the form of proteins in the cell. 


Molecular Biology - The Structure of DNA 
By the end of this section, you will be able to: 


• Describe the structure of DNA
• Describe how eukaryotic and prokaryotic DNA is arranged in the cell


In the 1950s, Francis Crick and James Watson worked together at the 
University of Cambridge, England, to determine the structure of DNA. 
Other scientists, such as Linus Pauling and Maurice Wilkins, were also 
actively exploring this field. Pauling had discovered the secondary structure 
of proteins using X-ray crystallography. X-ray crystallography is a method 
for investigating molecular structure by observing the patterns formed by 
X-rays shot through a crystal of the substance. The patterns give important 
information about the structure of the molecule of interest. In Wilkins’ lab, 
researcher Rosalind Franklin was using X-ray crystallography to understand 
the structure of DNA. Watson and Crick were able to piece together the 
puzzle of the DNA molecule using Franklin's data ([link]). Watson and
Crick also had key pieces of information available from other researchers 
such as Chargaff’s rules. Chargaff had shown that of the four kinds of
monomers (nucleotides) present in a DNA molecule, two types were always
present in equal amounts and the remaining two types were also always 
present in equal amounts. This meant they were always paired in some way. 
In 1962, James Watson, Francis Crick, and Maurice Wilkins were awarded
the Nobel Prize in Physiology or Medicine for their work in determining the
structure of DNA.


Pioneering scientists (a) James Watson and Francis Crick are pictured here
with American geneticist Maclyn McCarty. Scientist Rosalind Franklin
discovered (b) the X-ray diffraction pattern of DNA, which helped to
elucidate its double helix structure. (credit a: modification of work by
Marjorie McCarty; b: modification of work by NIH)


Now let’s consider the structure of the two types of nucleic acids, 
deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). The building 
blocks of DNA are nucleotides, which are made up of three parts: a 
deoxyribose (5-carbon sugar), a phosphate group, and a nitrogenous base 
([link]). There are four types of nitrogenous bases in DNA. Adenine (A)
and guanine (G) are double-ringed purines, and cytosine (C) and thymine 
(T) are smaller, single-ringed pyrimidines. The nucleotide is named 
according to the nitrogenous base it contains.


(a) Each DNA nucleotide is made up of a sugar, a phosphate group, and a
base. (b) Cytosine and thymine are pyrimidines. Guanine and adenine are
purines.


The phosphate group of one nucleotide bonds covalently with the sugar 
molecule of the next nucleotide, and so on, forming a long polymer of 
nucleotide monomers. The sugar-phosphate groups line up in a “backbone”
for each single strand of DNA, and the nucleotide bases stick out from this
backbone. The carbon atoms of the five-carbon sugar are numbered
clockwise from the oxygen as 1', 2', 3', 4', and 5' (1' is read as “one prime”).
The phosphate group is attached to the 5' carbon of one nucleotide and the 
3' carbon of the next nucleotide. In its natural state, each DNA molecule is 
actually composed of two single strands held together along their length 
with hydrogen bonds between the bases. 


Watson and Crick proposed that the DNA is made up of two strands that are 
twisted around each other to form a right-handed helix, called a double 
helix. Base-pairing takes place between a purine and pyrimidine: namely, A 
pairs with T, and G pairs with C. In other words, adenine and thymine are 
complementary base pairs, and cytosine and guanine are also 
complementary base pairs. This is the basis for Chargaff’s rule; because of 
their complementarity, there is as much adenine as thymine in a DNA
molecule and as much guanine as cytosine. Adenine and thymine are 
connected by two hydrogen bonds, and cytosine and guanine are connected 
by three hydrogen bonds. The two strands are anti-parallel in nature; that is, 
one strand will have the 3' carbon of the sugar in the “upward” position, 
whereas the other strand will have the 5' carbon in the upward position. The 
diameter of the DNA double helix is uniform throughout because a purine 
(two rings) always pairs with a pyrimidine (one ring) and their combined 
lengths are always equal ([link]).


DNA (a) forms a double-stranded helix, and (b) adenine pairs with thymine
and cytosine pairs with guanine. (credit a: modification of work by Jerome
Walker, Dennis Myts)
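

The pairing rules described above can be checked with a short sketch. The code below is only an illustration: the sequence GGGCAT is made up, and the function names are hypothetical. It counts the bases across both strands of a duplex, which shows why Chargaff found equal amounts of A and T and of G and C, and it totals the hydrogen bonds using the two-per-A-T and three-per-G-C figures given in the text.

```python
# Toy model of complementary base pairing; the example sequence is made up.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds per base pair, as stated in the text: A-T has 2, G-C has 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def partner_strand(strand):
    """Return the base that pairs with each position of `strand`."""
    return "".join(PAIR[base] for base in strand)


def duplex_base_counts(strand):
    """Count each base across both strands of the duplex."""
    duplex = strand + partner_strand(strand)
    return {base: duplex.count(base) for base in "ATGC"}


def hydrogen_bond_total(strand):
    """Total number of hydrogen bonds between the two strands."""
    return sum(H_BONDS[base] for base in strand)


counts = duplex_base_counts("GGGCAT")
print(counts)                          # A equals T, and G equals C
print(hydrogen_bond_total("GGGCAT"))   # four G-C pairs and two A-T pairs
```

Because every base on one strand fixes its partner on the other, the equal counts hold for any input sequence, not just this one.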


The Structure of RNA 


There is a second nucleic acid in all cells called ribonucleic acid, or RNA. 
Like DNA, RNA is a polymer of nucleotides. Each of the nucleotides in 
RNA is made up of a nitrogenous base, a five-carbon sugar, and a 
phosphate group. In the case of RNA, the five-carbon sugar is ribose, not 
deoxyribose. Ribose has a hydroxyl group at the 2' carbon, unlike 
deoxyribose, which has only a hydrogen atom ([link]). 


The difference between the ribose found in RNA and the deoxyribose found in
DNA is that ribose has a hydroxyl group at the 2' carbon.


RNA nucleotides contain the nitrogenous bases adenine, cytosine, and 
guanine. However, they do not contain thymine, which is instead replaced 
by uracil, symbolized by a “U.” RNA exists as a single-stranded molecule 
rather than a double-stranded helix. Molecular biologists have named 
several kinds of RNA on the basis of their function. These include 
messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA 
(rRNA)—molecules that are involved in the production of proteins from the 
DNA code. 


How DNA Is Arranged in the Cell 


DNA is a working molecule; it must be replicated when a cell is ready to 
divide, and it must be “read” to produce the molecules, such as proteins, to 
carry out the functions of the cell. For this reason, the DNA is protected and 
packaged in very specific ways. In addition, DNA molecules can be very 
long. Stretched end-to-end, the DNA molecules in a single human cell 
would come to a length of about 2 meters. Thus, the DNA for a cell must be 
packaged in a very ordered way to fit and function within a structure (the 
cell) that is not visible to the naked eye. The chromosomes of prokaryotes 
are much simpler than those of eukaryotes in many of their features ([link]).
Most prokaryotes contain a single, circular chromosome that is found in an 
area in the cytoplasm called the nucleoid. 


A eukaryote contains a well-defined nucleus, whereas in prokaryotes, the
chromosome lies in the cytoplasm in an area called the nucleoid.


The size of the genome in one of the most well-studied prokaryotes, 
Escherichia coli, is 4.6 million base pairs, which would extend a distance of 
about 1.6 mm if stretched out. So how does this fit inside a small bacterial 
cell? The DNA is twisted beyond the double helix in what is known as
supercoiling. Some proteins are known to be involved in the supercoiling;
other proteins and enzymes help in maintaining the supercoiled structure. 
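

The stretched-out length quoted above can be reproduced with one multiplication. The sketch below assumes a spacing of about 0.34 nm between neighboring base pairs along the helix; that figure is not given in this text and is used here as a commonly cited value for the usual form of DNA. The function name is hypothetical.

```python
# Assumption (not from the text): about 0.34 nm of helix length per base pair.
RISE_PER_BASE_PAIR_NM = 0.34


def stretched_length_mm(base_pairs, rise_nm=RISE_PER_BASE_PAIR_NM):
    """Length of a fully extended DNA molecule, converted from nm to mm."""
    return base_pairs * rise_nm * 1e-6


ecoli_length_mm = stretched_length_mm(4.6e6)  # E. coli genome size from the text
print(f"{ecoli_length_mm:.2f} mm")
```

Under this assumption the result is close to the 1.6 mm quoted in the text, which is hundreds of times longer than a typical bacterial cell and is why supercoiling is needed.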


Eukaryotes, whose chromosomes each consist of a linear DNA molecule, 
employ a different type of packing strategy to fit their DNA inside the 
nucleus ([link]). At the most basic level, DNA is wrapped around proteins
known as histones to form structures called nucleosomes. The DNA is 
wrapped tightly around the histone core. This nucleosome is linked to the 
next one by a short strand of DNA that is free of histones. This is also 
known as the “beads on a string” structure; the nucleosomes are the “beads”
and the short lengths of DNA between them are the “string.” The
nucleosomes, with their DNA coiled around them, stack compactly onto
each other to form a 30-nm-wide fiber. This fiber is further coiled into a
thicker and more compact structure. At the metaphase stage of mitosis, 
when the chromosomes are lined up in the center of the cell, the 
chromosomes are at their most compacted. They are approximately 700 nm 
in width, and are found in association with scaffold proteins. 


In interphase, the phase of the cell cycle between mitoses at which the 
chromosomes are decondensed, eukaryotic chromosomes have two distinct 
regions that can be distinguished by staining. There is a tightly packaged 
region that stains darkly, and a less dense region. The darkly staining 
regions usually contain genes that are not active, and are found in the 
regions of the centromere and telomeres. The lightly staining regions 
usually contain genes that are active, with DNA packaged around 
nucleosomes but not further compacted. 


Organization of Eukaryotic Chromosomes


These figures illustrate the compaction of the eukaryotic chromosome.


Note: 
Concept in Action
Watch this animation of DNA packaging.


Section Summary 


The model of the double-helix structure of DNA was proposed by Watson 
and Crick. The DNA molecule is a polymer of nucleotides. Each nucleotide 
is composed of a nitrogenous base, a five-carbon sugar (deoxyribose), and a 
phosphate group. There are four nitrogenous bases in DNA, two purines 
(adenine and guanine) and two pyrimidines (cytosine and thymine). A DNA 
molecule is composed of two strands. Each strand is composed of 
nucleotides bonded together covalently between the phosphate group of one 
and the deoxyribose sugar of the next. From this backbone extend the bases. 
The bases of one strand bond to the bases of the second strand with 
hydrogen bonds. Adenine always bonds with thymine, and cytosine always 
bonds with guanine. The bonding causes the two strands to spiral around 
each other in a shape called a double helix. Ribonucleic acid (RNA) is a 
second nucleic acid found in cells. RNA is a single-stranded polymer of 
nucleotides. It also differs from DNA in that it contains the sugar ribose, 
rather than deoxyribose, and the nucleotide uracil rather than thymine. 
Various RNA molecules function in the process of forming proteins from 
the genetic code in DNA. 


Prokaryotes contain a single, double-stranded circular chromosome. 
Eukaryotes contain double-stranded linear DNA molecules packaged into 
chromosomes. The DNA helix is wrapped around proteins to form 
nucleosomes. The protein coils are further coiled, and during mitosis and 
meiosis, the chromosomes become even more greatly coiled to facilitate 
their movement. Chromosomes have two distinct regions which can be
distinguished by staining, reflecting different degrees of packaging and
determined by whether the DNA in a region is being expressed
(euchromatin) or not (heterochromatin).


Glossary 


deoxyribose 
a five-carbon sugar molecule with a hydrogen atom rather than a
hydroxyl group in the 2' position; the sugar component of DNA
nucleotides 


double helix 
the molecular shape of DNA in which two strands of nucleotides wind 
around each other in a spiral shape 


nitrogenous base 
a nitrogen-containing molecule that acts as a base; often referring to 
one of the purine or pyrimidine components of nucleic acids 


phosphate group 
a molecular group consisting of a central phosphorus atom bound to 
four oxygen atoms 


Molecular Biology - DNA Replication 
By the end of this section, you will be able to: 


• Explain the process of DNA replication
• Explain the importance of telomerase to DNA replication
• Describe mechanisms of DNA repair


When a cell divides, it is important that each daughter cell receives an 
identical copy of the DNA. This is accomplished by the process of DNA 
replication. The replication of DNA occurs during the synthesis phase, or S 
phase, of the cell cycle, before the cell enters mitosis or meiosis. 


The elucidation of the structure of the double helix provided a hint as to 
how DNA is copied. Recall that adenine nucleotides pair with thymine 
nucleotides, and cytosine with guanine. This means that the two strands are 
complementary to each other. For example, a strand of DNA with a
nucleotide sequence of AGTCATGA will have a complementary strand
with the sequence TCAGTACT ([link]).


The two strands of DNA are complementary, meaning the sequence of bases
in one strand can be used to create the correct sequence of bases in the
other strand.
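

The AGTCATGA example can be reproduced with a few lines of code. This is a minimal sketch with hypothetical function names. The first function pairs each base position by position, as the text does; the second also reverses the result, which is how the partner strand reads in its own 5' to 3' direction, because the two strands are anti-parallel.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Pair each base with its partner, position by position."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand):
    """The partner strand read in its own 5' to 3' direction."""
    return complement(strand)[::-1]


original = "AGTCATGA"  # sequence used in the text
print(complement(original))
print(reverse_complement(original))
```

Applying `complement` twice returns the original sequence, which is the property that lets either strand serve as a template for rebuilding the other.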


Because of the complementarity of the two strands, having one strand 
means that it is possible to recreate the other strand. This model for 
replication suggests that the two strands of the double helix separate during 
replication, and each strand serves as a template from which the new 
complementary strand is copied ([link]).


Semiconservative Model of DNA Replication


The semiconservative model of DNA replication is shown. Gray indicates the
original DNA strands, and blue indicates newly synthesized DNA.


During DNA replication, each of the two strands that make up the double 
helix serves as a template from which new strands are copied. The new 
strand will be complementary to the parental or “old” strand. Each new 
double strand consists of one parental strand and one new daughter strand. 
This is known as semiconservative replication. When two DNA copies are 
formed, they have an identical sequence of nucleotide bases and are divided 
equally into two daughter cells. 
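

Semiconservative replication can be traced with a small bookkeeping sketch. Everything here is illustrative: each strand is stored as a hypothetical (sequence, round_made) pair, with round 0 marking the original parental strands, and the sequences are the AGTCATGA example used earlier in this section.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(sequence):
    return "".join(PAIR[base] for base in sequence)


def replicate(duplex, round_number):
    """Separate a duplex and give each old strand a newly made partner."""
    first, second = duplex
    new_for_first = (complement(first[0]), round_number)
    new_for_second = (complement(second[0]), round_number)
    return [(first, new_for_first), (second, new_for_second)]


def replicate_all(duplexes, round_number):
    """Replicate every duplex in a list once."""
    copies = []
    for duplex in duplexes:
        copies.extend(replicate(duplex, round_number))
    return copies


def has_original_strand(duplex):
    """True if either strand dates from round 0 (the parental molecule)."""
    return any(made == 0 for _, made in duplex)


parental = (("AGTCATGA", 0), ("TCAGTACT", 0))
round_one = replicate_all([parental], 1)
round_two = replicate_all(round_one, 2)

print(len(round_one), sum(has_original_strand(d) for d in round_one))
print(len(round_two), sum(has_original_strand(d) for d in round_two))
```

After one round, both duplexes carry one parental strand and one new strand. After two rounds there are four duplexes, but still only two original strands, so only two of the four contain parental DNA.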


DNA Replication in Eukaryotes 


Because eukaryotic genomes are very complex, DNA replication is a very 
complicated process that involves several enzymes and other proteins. It 
occurs in three main stages: initiation, elongation, and termination. 


Recall that eukaryotic DNA is bound to proteins known as histones to form 
structures called nucleosomes. During initiation, the DNA is made 
accessible to the proteins and enzymes involved in the replication process. 
How does the replication machinery know where on the DNA double helix 
to begin? It turns out that there are specific nucleotide sequences called 
origins of replication at which replication begins. Certain proteins bind to 
the origin of replication while an enzyme called helicase unwinds and 
opens up the DNA helix. As the DNA opens up, Y-shaped structures called 
replication forks are formed ([link]). Two replication forks are formed at
the origin of replication, and these get extended in both directions as 
replication proceeds. There are multiple origins of replication on the 
eukaryotic chromosome, such that replication can occur simultaneously 
from several places in the genome. 


During elongation, an enzyme called DNA polymerase adds DNA
nucleotides to the 3' end of the newly synthesized strand, using the
parental strand as a template. Because DNA polymerase can only add new
nucleotides to the end of an existing backbone, a primer sequence, which
provides this starting point, is added with complementary RNA
nucleotides. This primer is removed later, and the nucleotides are replaced
with DNA nucleotides. One strand, which is complementary to the parental
DNA strand, is synthesized continuously toward the replication fork so the
polymerase can add nucleotides in this direction. This continuously
synthesized strand is known as the leading strand. Because DNA
polymerase can only synthesize DNA in a 5' to 3' direction, the other new
strand is put together in short pieces called Okazaki fragments. The
Okazaki fragments each require a primer made of RNA to start the
synthesis. The strand with the Okazaki fragments is known as the lagging
strand. As synthesis proceeds, an enzyme removes the RNA primer, which
is then replaced with DNA nucleotides, and the gaps between fragments are
sealed by an enzyme called DNA ligase.


The process of DNA replication can be summarized as follows: 


1. DNA unwinds at the origin of replication. 

2. New nucleotides complementary to the parental strands are added. One
new strand is made continuously, while the other strand is made in pieces.

3. Primers are removed, new DNA nucleotides are put in place of the 
primers and the backbone is sealed by DNA ligase. 
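

The three steps above can be sketched for the lagging strand. This is a deliberately simplified toy, not a model of the real enzymes: strand direction is ignored, the fragment and primer lengths are arbitrary made-up values, and lowercase letters stand for RNA primer nucleotides. All function names are hypothetical.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Lowercase marks RNA primer nucleotides (uracil pairs with adenine).
RNA_PAIR = {"A": "u", "T": "a", "G": "c", "C": "g"}
RNA_TO_DNA = {"u": "T", "a": "A", "c": "C", "g": "G"}


def okazaki_fragments(template, fragment_len=4, primer_len=2):
    """Copy the template in short pieces, each starting with an RNA primer."""
    fragments = []
    for start in range(0, len(template), fragment_len):
        piece = template[start:start + fragment_len]
        primer = "".join(RNA_PAIR[base] for base in piece[:primer_len])
        dna = "".join(DNA_PAIR[base] for base in piece[primer_len:])
        fragments.append(primer + dna)
    return fragments


def replace_primers(fragments):
    """Swap each RNA primer nucleotide for the matching DNA nucleotide."""
    return ["".join(RNA_TO_DNA.get(ch, ch) for ch in frag) for frag in fragments]


def ligate(fragments):
    """Join the fragments into one continuous strand, as DNA ligase does."""
    return "".join(fragments)


pieces = okazaki_fragments("AGTCATGA")
print(pieces)
print(ligate(replace_primers(pieces)))
```

If the `ligate` step is skipped, the pieces remain separate, which mirrors a cell in which the joining of Okazaki fragments is impaired.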


Note: 
Art Connection
A replication fork is formed by the opening of the origin of replication,
and helicase separates the DNA strands. An RNA primer is synthesized, and
is elongated by the DNA polymerase. On the leading strand, DNA is
synthesized continuously, whereas on the lagging strand, DNA is synthesized
in short stretches. The DNA fragments are joined by DNA ligase (not shown).


You isolate a cell strain in which the joining together of Okazaki fragments 
is impaired and suspect that a mutation has occurred in an enzyme found at 
the replication fork. Which enzyme is most likely to be mutated? 


Telomere Replication 


Because eukaryotic chromosomes are linear, DNA replication comes to the 
end of a line in eukaryotic chromosomes. As you have learned, the DNA 
polymerase enzyme can add nucleotides in only one direction. In the 
leading strand, synthesis continues until the end of the chromosome is 
reached; however, on the lagging strand there is no place for a primer to be 
made for the DNA fragment to be copied at the end of the chromosome. 
This presents a problem for the cell because the ends remain unpaired, and 
over time these ends get progressively shorter as cells continue to divide. 
The ends of the linear chromosomes are known as telomeres, which have 
repetitive sequences that do not code for a particular gene. As a 
consequence, it is telomeres that are shortened with each round of DNA 
replication instead of genes. For example, in humans, a six base-pair 
sequence, TTAGGG, is repeated 100 to 1000 times. The discovery of the 
enzyme telomerase ([link]) helped in the understanding of how
chromosome ends are maintained. The telomerase attaches to the end of the
chromosome, and bases complementary to its RNA template are added on
the end of the DNA strand. Once the lagging strand template is sufficiently 
elongated, DNA polymerase can now add nucleotides that are 
complementary to the ends of the chromosomes. Thus, the ends of the 
chromosomes are replicated.


Telomerase has an associated RNA that complements the 3' overhang at the
end of the chromosome. The RNA template is used to synthesize the
complementary strand. Telomerase shifts, and the process is repeated.

The ends of linear chromosomes are maintained by the action of the
telomerase enzyme.
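

The effect of telomerase on telomere length can be illustrated with a toy simulation. The TTAGGG repeat and the starting count of 100 repeats come from the text; the number of repeats lost per division and the number added back by telomerase are made-up values chosen only to make the trend visible, and the function names are hypothetical.

```python
REPEAT = "TTAGGG"  # human telomere repeat, from the text


def make_telomere(n_repeats):
    return REPEAT * n_repeats


def divide(telomere, lost_repeats=10):
    """Hypothetical loss: each division trims `lost_repeats` repeats."""
    keep = max(0, len(telomere) - lost_repeats * len(REPEAT))
    return telomere[:keep]


def telomerase_extend(telomere, added_repeats=10):
    """Telomerase adds repeats back onto the end of the chromosome."""
    return telomere + REPEAT * added_repeats


def after_divisions(telomere, divisions, telomerase_active):
    for _ in range(divisions):
        telomere = divide(telomere)
        if telomerase_active:
            telomere = telomerase_extend(telomere)
    return telomere


start = make_telomere(100)
without_telomerase = after_divisions(start, 5, telomerase_active=False)
with_telomerase = after_divisions(start, 5, telomerase_active=True)
print(len(without_telomerase) // len(REPEAT))
print(len(with_telomerase) // len(REPEAT))
```

With these assumed numbers, the telomere is halved after five divisions when telomerase is inactive and keeps its starting length when telomerase is active.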


Telomerase is typically found to be active in germ cells, adult stem cells, 
and some cancer cells. For her discovery of telomerase and its action, 
Elizabeth Blackburn ([link]) received the Nobel Prize in Physiology or
Medicine in 2009.


Elizabeth Blackburn, 2009 Nobel 
Laureate, was the scientist who 
discovered how telomerase works. 
(credit: U.S. Embassy, Stockholm, 
Sweden) 


Telomerase is not active in adult somatic cells. Adult somatic cells that 
undergo cell division continue to have their telomeres shortened. This 
essentially means that telomere shortening is associated with aging. In 
2010, scientists found that telomerase can reverse some age-related 
conditions in mice, and this may have potential in regenerative medicine. 
[footnote] Telomerase-deficient mice were used in these studies; these mice
have tissue atrophy, stem-cell depletion, organ system failure, and impaired
tissue injury responses. Telomerase reactivation in these mice caused
extension of telomeres, reduced DNA damage, reversed neurodegeneration,
and improved functioning of the testes, spleen, and intestines. Thus,
telomerase reactivation may have potential for treating age-related diseases in
humans.

Mariella Jaskelioff, et al., “Telomerase reactivation reverses tissue
degeneration in aged telomerase-deficient mice,” Nature, 469 (2011):102-7.


DNA Replication in Prokaryotes 


Recall that the prokaryotic chromosome is a circular molecule with a less 
extensive coiling structure than eukaryotic chromosomes. The eukaryotic 
chromosome is linear and highly coiled around proteins. While there are 
many similarities in the DNA replication process, these structural 
differences necessitate some differences in the DNA replication process in 
these two life forms. 


DNA replication has been extremely well-studied in prokaryotes, primarily 
because of the small size of the genome and large number of variants 
available. Escherichia coli has 4.6 million base pairs in a single circular 
chromosome, and all of it gets replicated in approximately 42 minutes, 
starting from a single origin of replication and proceeding around the 
chromosome in both directions. This means that approximately 1000 
nucleotides are added per second. The process is much more rapid than in 
eukaryotes. [link] summarizes the differences between prokaryotic and 
eukaryotic replications. 


Differences between Prokaryotic and Eukaryotic Replications 


Property               Prokaryotes          Eukaryotes
Origin of replication  Single               Multiple
Rate of replication    1000 nucleotides/s   50 to 100 nucleotides/s
Chromosome structure   Circular             Linear
Telomerase             Not present          Present
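

The prokaryotic rate can be checked against the E. coli figures given earlier in this section. The sketch below uses only numbers from the text (4.6 million base pairs, about 42 minutes, replication in both directions from one origin); the variable names are hypothetical, and dividing by two forks is an assumption about how the work is shared.

```python
GENOME_BASE_PAIRS = 4.6e6   # E. coli genome size, from the text
REPLICATION_MINUTES = 42    # approximate replication time, from the text
REPLICATION_FORKS = 2       # one origin, proceeding in both directions

seconds = REPLICATION_MINUTES * 60
base_pairs_per_second = GENOME_BASE_PAIRS / seconds
per_fork = base_pairs_per_second / REPLICATION_FORKS

print(round(base_pairs_per_second), "base pairs per second overall")
print(round(per_fork), "base pairs per second at each fork")
```

The result is roughly 1,800 base pairs per second overall and roughly 900 at each fork, which agrees in order of magnitude with the figure of approximately 1000 nucleotides per second.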


Note: 
Concept in Action
Click through a tutorial on DNA replication.


DNA Repair 


DNA polymerase can make mistakes while adding nucleotides. It edits the 
DNA by proofreading every newly added base. Incorrect bases are removed 
and replaced by the correct base, and then polymerization continues 
([link]a). Most mistakes are corrected during replication, although when
this does not happen, the mismatch repair mechanism is employed.
Mismatch repair enzymes recognize the wrongly incorporated base and
excise it from the DNA, replacing it with the correct base ([link]b). In yet
another type of repair, nucleotide excision repair, the DNA double strand
is unwound and separated, the incorrect bases are removed along with a few
bases on the 5' and 3' ends, and these are replaced by copying the template
with the help of DNA polymerase ([link]c). Nucleotide excision repair is
particularly important in correcting thymine dimers, which are primarily
caused by ultraviolet light. In a thymine dimer, two thymine nucleotides
adjacent to each other on one strand are covalently bonded to each other
rather than to their complementary bases. If the dimer is not removed and
repaired, it may lead to a mutation. Individuals with flaws in their nucleotide
excision repair genes show extreme sensitivity to sunlight and develop skin 
cancers early in life.


Proofreading by DNA polymerase (a) corrects errors during replication. In
mismatch repair (b), the incorrectly added base is detected after
replication. The mismatch repair proteins detect this base and remove it
from the newly synthesized strand by nuclease action. The gap is now filled
with the correctly paired base. Nucleotide excision (c) repairs thymine
dimers. When exposed to UV, thymines lying adjacent to each other can form
thymine dimers. In normal cells, they are excised and replaced.


Most mistakes are corrected; if they are not, they may result in a mutation,
defined as a permanent change in the DNA sequence. Mutations in repair
genes may lead to serious consequences like cancer.
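

The logic of mismatch repair can be sketched as a comparison between the template and the newly made strand. The code below is a toy illustration with hypothetical function names and made-up sequences. It finds positions where the new strand does not pair correctly, restores the correct base, and separately lists neighboring thymines, which are the sites where ultraviolet light can form thymine dimers.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template, new_strand):
    """Positions where the new base is not the partner of the template base."""
    return [i for i, (t, n) in enumerate(zip(template, new_strand))
            if PAIR[t] != n]


def mismatch_repair(template, new_strand):
    """Replace each wrongly paired base using the template as the guide."""
    repaired = list(new_strand)
    for i in find_mismatches(template, new_strand):
        repaired[i] = PAIR[template[i]]
    return "".join(repaired)


def adjacent_thymines(strand):
    """Start positions of neighboring T-T pairs on one strand."""
    return [i for i in range(len(strand) - 1) if strand[i:i + 2] == "TT"]


template = "AGTCATGA"
faulty_copy = "TCAGTGCT"   # made-up copy with one wrong base

print(find_mismatches(template, faulty_copy))
print(mismatch_repair(template, faulty_copy))
print(adjacent_thymines("ATTGCTTT"))
```

The repair step relies on the template strand being correct, which is why an error that escapes repair and is then copied in the next round of replication becomes a permanent mutation.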


Section Summary 


DNA replicates by a semi-conservative method in which each of the two
parental DNA strands acts as a template for new DNA to be synthesized.
After replication, each DNA has one parental or “old” strand, and one 
daughter or “new” strand. 


Replication in eukaryotes starts at multiple origins of replication, while 
replication in prokaryotes starts from a single origin of replication. The 
DNA is opened with enzymes, resulting in the formation of the replication 
fork. Primase synthesizes an RNA primer to initiate synthesis by DNA 
polymerase, which can add nucleotides in only one direction. One strand is 
synthesized continuously in the direction of the replication fork; this is 
called the leading strand. The other strand is synthesized in a direction away 
from the replication fork, in short stretches of DNA known as Okazaki 
fragments. This strand is known as the lagging strand. Once replication is
completed, the RNA primers are replaced by DNA nucleotides and the
DNA is sealed with DNA ligase.


The ends of eukaryotic chromosomes pose a problem, as polymerase is 
unable to extend them without a primer. Telomerase, an enzyme with an 
inbuilt RNA template, extends the ends by copying the RNA template and 
extending one end of the chromosome. DNA polymerase can then extend 
the DNA using the primer. In this way, the ends of the chromosomes are 
protected. Cells have mechanisms for repairing DNA when it becomes 
damaged or errors are made in replication. These mechanisms include 
mismatch repair to replace nucleotides that are paired with a
non-complementary base and nucleotide excision repair, which removes bases
that are damaged such as thymine dimers.


Art Connections 


Exercise: 


Problem: 


[link] You isolate a cell strain in which the joining together of Okazaki 
fragments is impaired and suspect that a mutation has occurred in an 
enzyme found at the replication fork. Which enzyme is most likely to 
be mutated? 


Solution: 


[link] Ligase, as this enzyme joins together Okazaki fragments. 


Glossary 


DNA ligase 
the enzyme that catalyzes the joining of DNA fragments together 


DNA polymerase 
an enzyme that synthesizes a new strand of DNA complementary to a 
template strand 


helicase 
an enzyme that helps to open up the DNA helix during DNA 
replication by breaking the hydrogen bonds 


lagging strand 
during replication of the 3' to 5' strand, the strand that is replicated in 
short fragments and away from the replication fork 


leading strand 
the strand that is synthesized continuously in the 5' to 3' direction,
toward the replication fork


mismatch repair 
a form of DNA repair in which non-complementary nucleotides are 
recognized, excised, and replaced with correct nucleotides 


mutation 
a permanent variation in the nucleotide sequence of a genome 


nucleotide excision repair 
a form of DNA repair in which the DNA molecule is unwound and 
separated in the region of the nucleotide damage, the damaged 
nucleotides are removed and replaced with new nucleotides using the 
complementary strand, and the DNA strand is resealed and allowed to 
rejoin its complement 


Okazaki fragments 
the DNA fragments that are synthesized in short stretches on the 
lagging strand 


primer 
a short stretch of RNA nucleotides that is required to initiate
replication and allow DNA polymerase to bind and begin replication


replication fork 
the Y-shaped structure formed during the initiation of replication 


semiconservative replication
the method used to replicate DNA in which the double-stranded
molecule is separated and each strand acts as a template for a new
strand to be synthesized, so the resulting DNA molecules are
composed of one new strand of nucleotides and one old strand of
nucleotides


telomerase 
an enzyme that contains a catalytic part and an inbuilt RNA template; 
it functions to maintain telomeres at chromosome ends 


telomere 
the DNA at the end of linear chromosomes 


Molecular Biology - Transcription of DNA 
By the end of this section, you will be able to: 


• Explain the central dogma
• Explain the main steps of transcription
• Describe how eukaryotic mRNA is processed


In both prokaryotes and eukaryotes, the second function of DNA (the first 
was replication) is to provide the information needed to construct the 
proteins necessary so that the cell can perform all of its functions. To do 
this, the DNA is “read” or transcribed into an mRNA molecule. The mRNA 
then provides the code to form a protein by a process called translation. 
Through the processes of transcription and translation, a protein is built 
with a specific sequence of amino acids that was originally encoded in the 
DNA. This module discusses the details of transcription. 


The Central Dogma: DNA Encodes RNA; RNA Encodes 
Protein 


The flow of genetic information in cells from DNA to mRNA to protein is 
described by the central dogma ([link]), which states that genes specify the 
sequences of mRNAs, which in turn specify the sequences of proteins. 


The central dogma states that DNA encodes RNA, which in turn encodes
protein.


The copying of DNA to mRNA is relatively straightforward, with one 
nucleotide being added to the mRNA strand for every complementary 
nucleotide read in the DNA strand. The translation to protein is more 
complex because groups of three mRNA nucleotides correspond to one 
amino acid of the protein sequence. However, as we shall see in the next 
module, the translation to protein is still systematic, such that nucleotides 1 
to 3 correspond to amino acid 1, nucleotides 4 to 6 correspond to amino 
acid 2, and so on. 
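

The grouping of nucleotides into threes can be shown with a short sketch. The mRNA sequence below is made up, and the function names are hypothetical. The code only splits the message into three-nucleotide groups and maps nucleotide positions to amino acid positions; it does not look up which amino acid each group specifies.

```python
def split_into_triplets(mrna):
    """Split an mRNA into consecutive groups of three nucleotides."""
    usable = len(mrna) - len(mrna) % 3   # ignore any incomplete final group
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def amino_acid_position(nucleotide_number):
    """Map a 1-based nucleotide number to its 1-based amino acid position."""
    return (nucleotide_number - 1) // 3 + 1


message = "AUGGCCUAAG"  # made-up sequence; the final G has no complete group

print(split_into_triplets(message))
print([amino_acid_position(n) for n in range(1, 7)])
```

Nucleotides 1 to 3 map to position 1 and nucleotides 4 to 6 map to position 2, matching the pattern described in the text.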


Transcription: from DNA to mRNA 


Both prokaryotes and eukaryotes perform fundamentally the same process 
of transcription, with the important difference of the membrane-bound 
nucleus in eukaryotes. With the genes bound in the nucleus, transcription 
occurs in the nucleus of the cell and the mRNA transcript must be 
transported to the cytoplasm. The prokaryotes, which include bacteria and 
archaea, lack membrane-bound nuclei and other organelles, and 
transcription occurs in the cytoplasm of the cell. In both prokaryotes and 
eukaryotes, transcription occurs in three main stages: initiation, elongation, 
and termination. 


Initiation 


Transcription requires the DNA double helix to partially unwind in the 
region of mRNA synthesis. The region of unwinding is called a 
transcription bubble. The DNA sequence onto which the proteins and 
enzymes involved in transcription bind to initiate the process is called a 
promoter. In most cases, promoters exist upstream of the genes they
regulate. The specific sequence of a promoter is very important because it
determines whether the corresponding gene is transcribed all of the time,
some of the time, or hardly at all ([link]).

The initiation of transcription begins when DNA is unwound, forming a
transcription bubble. Enzymes and other proteins involved in
transcription bind at the promoter.


Elongation 


Transcription always proceeds from one of the two DNA strands, which is 
called the template strand. The mRNA product is complementary to the 
template strand and is almost identical to the other DNA strand, called the 
nontemplate strand, with the exception that RNA contains a uracil (U) in 
place of the thymine (T) found in DNA. During elongation, an enzyme 
called RNA polymerase proceeds along the DNA template adding 
nucleotides by base pairing with the DNA template in a manner similar to 
DNA replication, with the difference that an RNA strand is being 
synthesized that does not remain bound to the DNA template. As elongation 
proceeds, the DNA is continuously unwound ahead of the core enzyme and 
rewound behind it ([link]).

During elongation, RNA polymerase tracks along the
DNA template, synthesizes mRNA in the 5' to 3'
direction, and unwinds then rewinds the DNA as it is
read.
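The base-pairing rule used during elongation can be illustrated with a short Python sketch. It assumes, purely for convenience, that the template strand is written in the 3' to 5' direction so that the resulting mRNA reads 5' to 3'; the function names and the example sequence are hypothetical.

```python
# Pairing rules: each DNA template base specifies one RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Pairing rules between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the mRNA (5' to 3') complementary to a template written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def nontemplate_strand(template_3_to_5):
    """Return the nontemplate DNA strand (5' to 3') that pairs with the template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


template = "TACCGGATT"  # hypothetical template strand, written 3' to 5'
mrna = transcribe(template)
coding = nontemplate_strand(template)

print(mrna)    # AUGGCCUAA
print(coding)  # ATGGCCTAA
# The mRNA matches the nontemplate strand except that U replaces T.
print(mrna == coding.replace("T", "U"))  # True
```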


Termination 


Once a gene is transcribed, the prokaryotic polymerase needs to be 
instructed to dissociate from the DNA template and liberate the newly made 
mRNA. Depending on the gene being transcribed, there are two kinds of 
termination signals, but both involve repeated nucleotide sequences in the 
DNA template that result in RNA polymerase stalling, leaving the DNA 
template, and freeing the mRNA transcript. 


On termination, the process of transcription is complete. In a prokaryotic 
cell, by the time termination occurs, the transcript would already have been 
used to partially synthesize numerous copies of the encoded protein because 
these processes can occur concurrently using multiple ribosomes 
(polyribosomes) ([link]). In contrast, the presence of a nucleus in eukaryotic 
cells precludes simultaneous transcription and translation. 


Multiple polymerases can transcribe a single
bacterial gene while numerous ribosomes
concurrently translate the mRNA transcripts
into polypeptides. In this way, a specific
protein can rapidly reach a high
concentration in the bacterial cell.


Eukaryotic RNA Processing 


The newly transcribed eukaryotic mRNAs must undergo several processing 
steps before they can be transferred from the nucleus to the cytoplasm and 
translated into a protein. The additional steps involved in eukaryotic mRNA 
maturation create a molecule that is much more stable than a prokaryotic 
mRNA. For example, eukaryotic mRNAs can last for several hours, whereas the
typical prokaryotic mRNA lasts only a few minutes.


The mRNA transcript is first coated in RNA-stabilizing proteins to prevent
it from degrading while it is processed and exported out of the nucleus. In
addition, while the pre-mRNA is still being synthesized, a special
nucleotide “cap” is added to the 5' end of the growing transcript. The cap
prevents degradation, and factors involved in protein synthesis recognize it
to help initiate translation by ribosomes.


Once elongation is complete, an enzyme then adds a string of 
approximately 200 adenine residues to the 3' end, called the poly-A tail. 
This modification further protects the pre-mRNA from degradation and 
signals to cellular factors that the transcript needs to be exported to the 
cytoplasm. 


Eukaryotic genes are composed of protein-coding sequences called exons 
(ex-on signifies that they are expressed) and intervening sequences called 
introns (int-ron denotes their intervening role). Introns are removed from 
the pre-mRNA during processing. Intron sequences in mRNA do not
encode functional proteins. It is essential that all of a pre-mRNA’s introns 
be completely and precisely removed before protein synthesis so that the 
exons join together to code for the correct amino acids. If the process errs 
by even a single nucleotide, the sequence of the rejoined exons would be 
shifted, and the resulting protein would be nonfunctional. The process of 
removing introns and reconnecting exons is called splicing ([link]). Introns
are removed and degraded while the pre-mRNA is still in the nucleus.

Eukaryotic mRNA contains introns that
must be spliced out. A 5' cap and 3' tail
are also added.
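Why splicing must be accurate to the single nucleotide can be shown with a small Python sketch. The sequences, the intron coordinates, and the function names below are hypothetical and chosen only for illustration; real splicing is carried out by cellular machinery that recognizes signals in the transcript, not by fixed coordinates.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) pairs and join the exons.

    Coordinates are 0-based and half-open, and introns must not overlap.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def codons(mrna):
    """Read the message three nucleotides at a time from its first base."""
    return [mrna[i:i + 3] for i in range(0, len(mrna), 3)]


EXON_1 = "AUGGCC"     # hypothetical exon
INTRON = "GUAAGUCAG"  # hypothetical intron (9 nucleotides)
EXON_2 = "AAAUAA"     # hypothetical exon
pre_mrna = EXON_1 + INTRON + EXON_2

precise = splice(pre_mrna, [(6, 15)])     # intron removed exactly
off_by_one = splice(pre_mrna, [(6, 14)])  # one intron nucleotide left behind

print(codons(precise))     # ['AUG', 'GCC', 'AAA', 'UAA']
print(codons(off_by_one))  # ['AUG', 'GCC', 'GAA', 'AUA', 'A']
```

Leaving a single extra nucleotide changes every codon after the splice junction, which is the shift the text warns about.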


Section Summary 


In prokaryotes, mRNA synthesis is initiated at a promoter sequence on the 
DNA template. Elongation synthesizes new mRNA. Termination liberates 
the mRNA and occurs by mechanisms that stall the RNA polymerase and 
cause it to fall off the DNA template. Newly transcribed eukaryotic mRNAs 
are modified with a cap and a poly-A tail. These structures protect the
mature mRNA from degradation and help export it from the nucleus.
Eukaryotic mRNAs also undergo splicing, in which introns are removed 
and exons are reconnected with single-nucleotide accuracy. Only finished 
mRNAs are exported from the nucleus to the cytoplasm. 


Glossary 


exon 
a sequence present in protein-coding mRNA after completion of pre- 
mRNA splicing 


intron 
a non-protein-coding intervening sequence that is spliced from
pre-mRNA during processing


mRNA 
messenger RNA; a form of RNA that carries the nucleotide sequence 
code for a protein sequence that is translated into a polypeptide 
sequence 


nontemplate strand 
the strand of DNA that is not used to transcribe mRNA; this strand is 
identical to the mRNA except that T nucleotides in the DNA are 
replaced by U nucleotides in the mRNA 


promoter 
a sequence on DNA to which RNA polymerase and associated factors 
bind and initiate transcription 


RNA polymerase 
an enzyme that synthesizes an RNA strand from a DNA template 
strand 


splicing 
the process of removing introns and reconnecting exons in a pre- 
mRNA 


template strand
the strand of DNA that specifies the complementary mRNA molecule


transcription bubble 
the region of locally unwound DNA that allows for transcription of 
mRNA 


Molecular Biology - Translation of RNA to make Protein 
By the end of this section, you will be able to: 


• Describe the different steps in protein synthesis
• Discuss the role of ribosomes in protein synthesis
• Describe the genetic code and how the nucleotide sequence determines
the amino acid and the protein sequence


The synthesis of proteins is one of a cell’s most energy-consuming 
metabolic processes. In turn, proteins account for more mass than any other 
component of living organisms (with the exception of water), and proteins 
perform a wide variety of the functions of a cell. The process of translation, 
or protein synthesis, involves decoding an mRNA message into a 
polypeptide product. Amino acids are covalently strung together in lengths 
ranging from approximately 50 amino acids to more than 1,000. 


The Protein Synthesis Machinery 


In addition to the mRNA template, many other molecules contribute to the 
process of translation. The composition of each component may vary across 
species; for instance, ribosomes may consist of different numbers of 
ribosomal RNAs (rRNA) and polypeptides depending on the organism. 
However, the general structures and functions of the protein synthesis 
machinery are comparable from bacteria to human cells. Translation 
requires the input of an mRNA template, ribosomes, tRNAs, and various 
enzymatic factors ([link]).

The protein synthesis machinery
includes the large and small
subunits of the ribosome, mRNA,
and tRNA. (credit: modification of
work by NIGMS, NIH)


In E. coli, there are 200,000 ribosomes present in every cell at any given 
time. A ribosome is a complex macromolecule composed of structural and 
catalytic rRNAs, and many distinct polypeptides. In eukaryotes, the 
nucleolus is completely specialized for the synthesis and assembly of 
rRNAs. 


Ribosomes are located in the cytoplasm in prokaryotes and in the cytoplasm 
and endoplasmic reticulum of eukaryotes. Ribosomes are made up of a 
large and a small subunit that come together for translation. The small 
subunit is responsible for binding the mRNA template, whereas the large 
subunit sequentially binds tRNAs, a type of RNA molecule that brings 
amino acids to the growing chain of the polypeptide. Each mRNA molecule 
is simultaneously translated by many ribosomes, all synthesizing protein in 
the same direction. 


Depending on the species, 40 to 60 types of tRNA exist in the cytoplasm. 
Serving as adaptors, specific tRNAs bind to sequences on the mRNA
template and add the corresponding amino acid to the polypeptide chain.
Therefore, tRNAs are the molecules that actually “translate” the language 
of RNA into the language of proteins. For each tRNA to function, it must 
have its specific amino acid bonded to it. In the process of tRNA 
“charging,” each tRNA molecule is bonded to its correct amino acid. 


The Genetic Code 


To summarize what we know to this point, the cellular process of 
transcription generates messenger RNA (mRNA), a mobile molecular copy
of one or more genes with an alphabet of A, C, G, and uracil (U). 
Translation of the mRNA template converts nucleotide-based genetic 
information into a protein product. Protein sequences consist of 20 
commonly occurring amino acids; therefore, it can be said that the protein 
alphabet consists of 20 letters. Each amino acid is defined by a three- 
nucleotide sequence called the triplet codon. The relationship between a 
nucleotide codon and its corresponding amino acid is called the genetic 
code. 


Given the different numbers of “letters” in the mRNA and protein
“alphabets,” combinations of nucleotides must correspond to single amino
acids. A two-nucleotide code would allow only 16 (4 x 4) combinations, too
few to specify 20 amino acids. Using a three-nucleotide code means that
there are a total of 64 (4 x 4 x 4) possible combinations; therefore, a given
amino acid can be encoded by more than one nucleotide triplet ([link]).

This figure shows the genetic code for
translating each nucleotide triplet, or
codon, in mRNA into an amino acid
or a termination signal in a nascent
protein. (credit: modification of work
by NIH)
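The counting argument behind the triplet code can be checked directly. The short Python sketch below enumerates every possible codon; the variable and function names are hypothetical, and the three stop codons are the ones named later in this module.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}


def count_words(word_length):
    """Number of distinct nucleotide 'words' of the given length."""
    return len(NUCLEOTIDES) ** word_length


# Enumerate every three-letter combination of the four nucleotides.
all_codons = ["".join(letters) for letters in product(NUCLEOTIDES, repeat=3)]
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(count_words(1), count_words(2), count_words(3))  # 4 16 64
print(len(all_codons), len(sense_codons))              # 64 61
```

With 61 codons left to specify 20 amino acids, at least some amino acids must be encoded by more than one codon.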


Three of the 64 codons terminate protein synthesis and release the 
polypeptide from the translation machinery. These triplets are called stop 
codons. Another codon, AUG, also has a special function. In addition to 
specifying the amino acid methionine, it also serves as the start codon to 
initiate translation. The reading frame for translation is set by the AUG start 
codon near the 5' end of the mRNA. The genetic code is nearly universal: with a
few exceptions, all species use the same genetic code for protein
synthesis, which is powerful evidence that all life on Earth shares a 
common origin. 
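How the start codon sets the reading frame and a stop codon ends it can be sketched in Python. The sketch assumes the standard genetic code, written compactly as a 64-letter string of one-letter amino acid symbols (with `*` marking a stop codon) in U, C, A, G order; the function name `translate` and the example message are hypothetical, and the sketch simply starts at the first AUG it finds, which simplifies how cells choose a start site.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): amino_acid
               for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: two leading bases, then AUG GCC AAA UAA, then two more.
print(translate("GGAUGGCCAAAUAAGG"))  # MAK
```

The bases before AUG are ignored, and everything after the stop codon is left untranslated.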


The Mechanism of Protein Synthesis 


Just as with mRNA synthesis, protein synthesis can be divided into three 
phases: initiation, elongation, and termination. The process of translation is
similar in prokaryotes and eukaryotes. Here we will explore how translation
occurs in E. coli, a representative prokaryote, and specify any differences 
between prokaryotic and eukaryotic translation. 


Protein synthesis begins with the formation of an initiation complex. In E. 
coli, this complex involves the small ribosome subunit, the mRNA 
template, three initiation factors, and a special initiator tRNA. The initiator 
tRNA interacts with the AUG start codon, and links to a special form of the 
amino acid methionine that is typically removed from the polypeptide after 
translation is complete. 


In prokaryotes and eukaryotes, the basics of polypeptide elongation are the 
same, so we will review elongation from the perspective of E. coli. The 
large ribosomal subunit of E. coli consists of three compartments: the A site 
binds incoming charged tRNAs (tRNAs with their attached specific amino 
acids). The P site binds charged tRNAs carrying amino acids that have 
formed bonds with the growing polypeptide chain but have not yet 
dissociated from their corresponding tRNA. The E site releases dissociated 
tRNAs so they can be recharged with free amino acids. The ribosome shifts 
one codon at a time, catalyzing each process that occurs in the three sites. 
With each step, a charged tRNA enters the complex, the polypeptide 
becomes one amino acid longer, and an uncharged tRNA departs. The 
energy for each bond between amino acids is derived from GTP, a molecule 
similar to ATP ([link]). Amazingly, the E. coli translation apparatus takes 
only 0.05 seconds to add each amino acid, meaning that a 200-amino acid 
polypeptide could be translated in just 10 seconds.

Translation begins when a tRNA
anticodon recognizes a codon on 
the mRNA. The large ribosomal 
subunit joins the small subunit, 
and a second tRNA is recruited. 
As the mRNA moves relative to 
the ribosome, the polypeptide 
chain is formed. Entry of a release 
factor into the A site terminates 
translation and the components 
dissociate. 
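The speed quoted for E. coli is simple arithmetic, shown here as a Python sketch. The figure of 0.05 seconds per amino acid comes from the text; the function names are hypothetical, and the calculation assumes a constant rate with no pauses.

```python
SECONDS_PER_AMINO_ACID = 0.05  # rate quoted in the text for E. coli


def translation_time(n_amino_acids, seconds_per_residue=SECONDS_PER_AMINO_ACID):
    """Seconds needed to add n amino acids at a constant rate."""
    return n_amino_acids * seconds_per_residue


def residues_per_second(seconds_per_residue=SECONDS_PER_AMINO_ACID):
    """Amino acids added per second at the given rate."""
    return 1 / seconds_per_residue


print(round(translation_time(200), 2))  # 10.0
print(round(residues_per_second()))     # 20
```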


Termination of translation occurs when a stop codon (UAA, UAG, or UGA) 
is encountered. When the ribosome encounters the stop codon, the growing 
polypeptide is released and the ribosome subunits dissociate and leave the 
mRNA. After many ribosomes have completed translation, the mRNA is 
degraded so the nucleotides can be reused in another transcription reaction. 




Section Summary 


The central dogma describes the flow of genetic information in the cell 
from genes to mRNA to proteins. Genes are used to make mRNA by the
process of transcription; mRNA is used to synthesize proteins by the 
process of translation. The genetic code is the correspondence between the 
three-nucleotide mRNA codon and an amino acid. The genetic code is 
“translated” by the tRNA molecules, which associate a specific codon with 
a specific amino acid. The genetic code is degenerate because 64 triplet 
codons in mRNA specify only 20 amino acids and three stop codons. This 
means that more than one codon corresponds to an amino acid. Almost 
every species on the planet uses the same genetic code. 


The players in translation include the mRNA template, ribosomes, tRNAs, 
and various enzymatic factors. The small ribosomal subunit binds to the 
mRNA template. Translation begins at the initiating AUG on the mRNA. 
The formation of bonds occurs between sequential amino acids specified by 
the mRNA template according to the genetic code. The ribosome accepts 
charged tRNAs, and as it steps along the mRNA, it catalyzes bonding 
between the new amino acid and the end of the growing polypeptide. The 
entire mRNA is translated in three-nucleotide “steps” of the ribosome. 
When a stop codon is encountered, a release factor binds and dissociates the 
components and frees the new protein. 


Glossary 


codon 
three consecutive nucleotides in mRNA that specify the addition of a 
specific amino acid or the release of a polypeptide chain during 
translation 


genetic code 
the amino acids that correspond to three-nucleotide codons of mRNA


rRNA 
ribosomal RNA; molecules of RNA that combine to form part of the 
ribosome 


stop codon 
one of the three mRNA codons that specifies termination of translation 


start codon 
the AUG (or, rarely, GUG) on an mRNA from which translation
begins; always specifies methionine 


tRNA 
transfer RNA; an RNA molecule that contains a specific three- 
nucleotide anticodon sequence to pair with the mRNA codon and also 
binds to a specific amino acid 


Molecular Biology - Gene Regulation 
By the end of this section, you will be able to: 


• Discuss why every cell does not express all of its genes
• Describe how prokaryotic gene expression occurs at the transcriptional
level
• Understand that eukaryotic gene expression occurs at the epigenetic,
transcriptional, post-transcriptional, translational, and post-translational
levels


For a cell to function properly, necessary proteins must be synthesized at 
the proper time. All organisms and cells control or regulate the transcription 
and translation of their DNA into protein. The process of turning on a gene 
to produce RNA and protein is called gene expression. Whether in a simple 
unicellular organism or in a complex multicellular organism, each cell 
controls when and how its genes are expressed. For this to occur, there must 
be a mechanism to control when a gene is expressed to make RNA and 
protein, how much of the protein is made, and when it is time to stop 
making that protein because it is no longer needed. 


Cells in multicellular organisms are specialized; cells in different tissues 
look very different and perform different functions. For example, a muscle 
cell is very different from a liver cell, which is very different from a skin 
cell. These differences are a consequence of the expression of different sets 
of genes in each of these cells. All cells have certain basic functions they 
must perform for themselves, such as converting the energy in sugar 
molecules into energy in ATP. Each cell also has many genes that are not 
expressed, and expresses many that are not expressed by other cells, such 
that it can carry out its specialized functions. In addition, cells will turn on 
or off certain genes at different times in response to changes in the 
environment or at different times during the development of the organism. 
Unicellular organisms, both eukaryotic and prokaryotic, also turn on and off 
genes in response to the demands of their environment so that they can 
respond to special conditions. 


The control of gene expression is extremely complex. Malfunctions in this 
process are detrimental to the cell and can lead to the development of many 
diseases, including cancer. 


Prokaryotic versus Eukaryotic Gene Expression 


To understand how gene expression is regulated, we must first understand 
how a gene becomes a functional protein in a cell. The process occurs in 
both prokaryotic and eukaryotic cells, just in slightly different fashions. 


Because prokaryotic organisms lack a cell nucleus, the processes of 
transcription and translation occur almost simultaneously. When the protein 
is no longer needed, transcription stops. As a result, the primary method to 
control what type and how much protein is expressed in a prokaryotic cell is 
through the regulation of DNA transcription into RNA. All the subsequent 
steps happen automatically. When more protein is required, more 
transcription occurs. Therefore, in prokaryotic cells, the control of gene 
expression is almost entirely at the transcriptional level. 


The first example of such control was discovered using E. coli in the 1950s 
and 1960s by French researchers and is called the lac operon. The lac 
operon is a stretch of DNA with three adjacent genes that code for proteins 
that participate in the absorption and metabolism of lactose, a food source 
for E. coli. When lactose is not present in the bacterium’s environment, the 
lac genes are transcribed in small amounts. When lactose is present, the
genes are transcribed at a much higher rate and the bacterium is able to use the lactose as a food
source. The operon also contains a promoter sequence to which the RNA 
polymerase binds to begin transcription; between the promoter and the three 
genes is a region called the operator. When there is no lactose present, a 
protein known as a repressor binds to the operator and prevents RNA 
polymerase from binding to the promoter, except in rare cases. Thus very 
little of the protein products of the three genes is made. When lactose is 
present, an end product of lactose metabolism binds to the repressor protein 
and prevents it from binding to the operator. This allows RNA polymerase 
to bind to the promoter and freely transcribe the three genes, allowing the 
organism to metabolize the lactose. 
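The on/off logic of the lac operon described above can be written as a tiny Python model. This is a deliberately simplified sketch covering only the repressor and operator interaction discussed in this section; the function name and the "high" and "very low" labels are hypothetical, and the real system responds in a graded rather than strictly binary way.

```python
def lac_operon_state(lactose_present):
    """Model the repressor/operator switch of the lac operon.

    Without lactose, the repressor sits on the operator and blocks RNA polymerase.
    With lactose, a product of lactose metabolism keeps the repressor off the
    operator, so RNA polymerase can bind the promoter.
    """
    repressor_on_operator = not lactose_present
    polymerase_can_bind = not repressor_on_operator
    return {
        "repressor_on_operator": repressor_on_operator,
        "polymerase_can_bind": polymerase_can_bind,
        "transcription": "high" if polymerase_can_bind else "very low",
    }


for lactose in (False, True):
    print(lactose, lac_operon_state(lactose)["transcription"])
# False very low
# True high
```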


Eukaryotic cells, in contrast, have intracellular organelles and are much 
more complex. Recall that in eukaryotic cells, the DNA is contained inside 
the cell’s nucleus and it is transcribed into mRNA there. The newly 
synthesized mRNA is then transported out of the nucleus into the
cytoplasm, where ribosomes translate the mRNA into protein. The
processes of transcription and translation are physically separated by the 
nuclear membrane; transcription occurs only within the nucleus, and 
translation only occurs outside the nucleus in the cytoplasm. The regulation 
of gene expression can occur at all stages of the process ([link]). Regulation
may occur when the DNA is uncoiled and loosened from nucleosomes to 
bind transcription factors (epigenetic level), when the RNA is transcribed 
(transcriptional level), when RNA is processed and exported to the 
cytoplasm after it is transcribed (post-transcriptional level), when the 
RNA is translated into protein (translational level), or after the protein has 
been made (post-translational level).

Eukaryotic gene expression is
regulated during transcription and
RNA processing, which take place
in the nucleus, as well as during
protein translation, which takes
place in the cytoplasm. Further
regulation may occur through
post-translational modifications of
proteins.


The differences in the regulation of gene expression between prokaryotes 
and eukaryotes are summarized in [link]. 


Differences in the Regulation of Gene Expression of Prokaryotic and Eukaryotic Organisms

Prokaryotic organisms          Eukaryotic organisms
---------------------          --------------------
Lack nucleus                   Contain nucleus

RNA transcription and          • RNA transcription occurs prior to protein
protein translation occur        translation, and it takes place in the nucleus.
almost simultaneously            RNA translation to protein occurs in the
                                 cytoplasm.
                               • RNA post-processing includes addition of a
                                 5' cap, poly-A tail, and excision of introns
                                 and splicing of exons.

Gene expression is             Gene expression is regulated at many levels
regulated primarily at the     (epigenetic, transcriptional,
transcriptional level          post-transcriptional, translational, and
                               post-translational)

Note: 


Evolution in Action 
Alternative RNA Splicing 


In the 1970s, genes were first observed that exhibited alternative RNA 
splicing. Alternative RNA splicing is a mechanism that allows different 
protein products to be produced from one gene when different 
combinations of introns (and sometimes exons) are removed from the 
transcript ([link]). This alternative splicing can be haphazard, but more 
often it is controlled and acts as a mechanism of gene regulation, with the 
frequency of different splicing alternatives controlled by the cell as a way 
to control the production of different protein products in different cells, or 
at different stages of development. Alternative splicing is now understood 
to be a common mechanism of gene regulation in eukaryotes; according to
one estimate, 70% of genes in humans are expressed as multiple proteins 
through alternative splicing.

There are five basic modes of alternative
splicing. Segments of pre-mRNA with exons
shown in blue, red, orange, and pink can be
spliced to produce a variety of new mature
mRNA segments.
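To see how one gene can yield several products, the following Python sketch enumerates the mature transcripts possible under one mode of alternative splicing, exon skipping. It rests on simplifying assumptions that are not from the text: the first and last exons are always kept, any internal exon may be skipped, and exon order never changes. The exon labels and the function name are hypothetical.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """List every transcript that keeps the first and last exon and
    any subset of the internal exons, preserving their order."""
    first, *internal, last = exons
    isoforms = []
    # Go from keeping all internal exons down to keeping none.
    for n_kept in range(len(internal), -1, -1):
        for kept in combinations(internal, n_kept):
            isoforms.append([first, *kept, last])
    return isoforms


for isoform in exon_skipping_isoforms(["E1", "E2", "E3", "E4"]):
    print("-".join(isoform))
# E1-E2-E3-E4
# E1-E2-E4
# E1-E3-E4
# E1-E4
```

Under these assumptions a gene with n exons could give up to 2^(n-2) transcripts, although a cell typically produces only a controlled subset of them.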


How could alternative splicing evolve? Introns have a beginning and 
ending recognition sequence, and it is easy to imagine the failure of the 
splicing mechanism to identify the end of an intron and find the end of the 
next intron, thus removing two introns and the intervening exon. In fact, 
there are mechanisms in place to prevent such exon skipping, but 
mutations are likely to lead to their failure. Such “mistakes” would more 
than likely produce a nonfunctional protein. Indeed, many genetic
diseases are caused by splicing errors rather than by mutations in a coding sequence.
However, alternative splicing would create a protein variant without the 
loss of the original protein, opening up possibilities for adaptation of the 
new variant to new functions. Gene duplication has played an important 
role in the evolution of new functions in a similar way—by providing 
genes that may evolve without eliminating the original functional protein. 


Section Summary 


While all somatic cells within an organism contain the same DNA, not all 
cells within that organism express the same proteins. Prokaryotic organisms
can express all of the genes they carry in every cell, but not necessarily all at
the same time; proteins are expressed only when they are needed.
In eukaryotic organisms, any given cell expresses only a subset of the genes in
its DNA. In each cell type, the type and amount of protein is regulated by
controlling gene expression. To express a protein, the DNA is first 
transcribed into RNA, which is then translated into proteins. In prokaryotic 
cells, these processes occur almost simultaneously. In eukaryotic cells, 
transcription occurs in the nucleus and is separate from the translation that 
occurs in the cytoplasm. Gene expression in prokaryotes is regulated primarily at
the transcriptional level, whereas in eukaryotic cells, gene expression is
regulated at the epigenetic, transcriptional, post-transcriptional, 
translational, and post-translational levels. 


Glossary 


alternative RNA splicing
a post-transcriptional gene regulation mechanism in eukaryotes in
which multiple protein products are produced by a single gene through 
alternative splicing combinations of the RNA transcript 


epigenetic 
describing non-genetic regulatory factors, such as changes in 
modifications to histone proteins and DNA that control accessibility to 
genes in chromosomes 


gene expression 
the process of turning on a gene to produce RNA and protein


post-transcriptional 
control of gene expression after the RNA molecule has been created 
but before it is translated into protein 


post-translational 
control of gene expression after a protein has been created 


